ABSTRACT


Naval  Aviation  aircraft  mishaps  continue  to  be  of  great 
concern  due  to  the  high  cost  of  loss  of  life  and  aircraft. 
The  goal  of  this  thesis  is  to  develop  a  predictive  statistical 
model  that  accurately  forecasts  Marine  Corps  AV-8B  Harrier 
aircraft  mishaps  based  on  existing  monthly  maintenance 
reports.  Monthly  maintenance  reports  provide  numerous 
independent  variables  based  on  personnel  levels  and 
maintenance  hours  that  could  possibly  be  used  to  forecast 
aircraft  mishaps.  These  variables  were  graphically  analyzed 
to  determine  any  relationships  that  could  be  exploited  in 
developing the model. Higher order relationships were
investigated  by  the  method  of  principal  components  and 
logistic  regression.  After  a  thorough  analysis,  there  appears 
to  be  no  combination  of  variables  in  this  particular  data  that 
could  be  used  to  forecast  aircraft  mishaps.  The  overall 
result  of  the  thesis  is  that  there  is  no  relationship  between 
monthly  maintenance  reports  and  aircraft  mishaps  that  can  be 
exploited  to  develop  a  predictive  statistical  model. 
EXECUTIVE SUMMARY


Aircraft mishaps continue to be a major concern to the
Marine Corps due to the high costs associated with the loss of
life and aircraft. A predictive statistical model or
quantitative formula that identifies, on the basis of prior
months' maintenance reports, a squadron at risk of having a
mishap would greatly enhance the commanding officer's ability
to prevent mishaps. This thesis attempts to develop a
predictive statistical model which identifies high risk
squadrons based on existing monthly maintenance reports. That
is, we want to identify a set of conditions in previous
months' maintenance records which presage with high
probability a mishap in the next month. Every squadron is
required to submit monthly maintenance reports that detail the
type and amount of maintenance performed on each aircraft in
that month and report maintenance personnel levels within the
squadron. Many experienced people involved in Naval Aviation
believe that they should be able to use these monthly reports
to identify a squadron at risk of having a mishap.

The  Marine  Corps  is  looking  for  a  predictive  statistical 
model  that  includes  all  aircraft  types,  but  because  of 
possible  different  operating  environments  and  procedures 
between  aircraft  types,  this  thesis  focuses  on  one  particular 
aircraft.  If  a  powerful  predictive  statistical  model  is 
developed  for  this  particular  aircraft,  then  there  is  hope 
that  the  analysis  and  the  statistical  model  could  be  expanded 
to  include  all  aircraft  types.  The  scope  of  the  thesis  has 
been  narrowed  to  developing  a  predictive  statistical  model  for 
the  Marine  Corps  AV-8B  Harrier  aircraft. 

The overall goal of the predictive statistical model is to
identify high risk squadrons based on existing monthly
maintenance and personnel reports, and not to determine the
cause of mishaps. The statistical model will not determine if
a squadron is doing the correct amount of maintenance or if
the squadron is adequately manned, but rather whether, given
the reported numbers, the squadron is at high risk of having a
mishap.

The predictive statistical model will be developed by
determining in which of the variables, or combinations of the
variables, there is a significant difference between the
previous month's maintenance pattern of a mishap squadron and
that of a non-mishap squadron. These variables can then be
used with various statistical prediction and classification
methods to attempt to forecast high risk squadrons.

A graphical analysis indicated that there were no one- or
two-dimensional relationships that could be used to classify
a mishap squadron. Furthermore, the techniques of
principal components and logistic regression did not produce
any higher order relationships that could be used to classify
a mishap squadron.

Based  on  this  particular  analyzed  data  there  apparently  is 
no  relationship  between  existing  monthly  maintenance  reports 
and  aircraft  mishaps.  This  may  indicate  that  there  is  no 
relationship  between  the  level  of  maintenance  and  mishaps,  but 
the  results  also  might  indicate  that  a  monthly  generated 
report  may  not  be  useful  in  predicting  an  aircraft  mishap. 
The fact that the data is reported once a month, at the end of
the month, could conceal any useful subtle changes or
indications of a high risk squadron that occur during the
month.

Two  alternative  recommendations  are  evident.  The  first 
alternative  is  to  accept  that  there  may  be  no  exploitable 
relationship  between  monthly  maintenance  reports  and  aircraft 
mishaps  and  focus  elsewhere  to  determine  a  predictive 
statistical model that forecasts aircraft mishaps. The second
alternative recommendation is that further analysis be done,
possibly using daily rather than monthly maintenance reports,
to determine a predictive statistical model that forecasts
aircraft mishaps.
I. INTRODUCTION

A. PROBLEM STATEMENT

Aircraft mishaps continue to be a major concern to the
Marine Corps due to the high costs associated with the loss of
life and aircraft. A predictive statistical model or
quantitative formula that identifies, on the basis of prior
months' maintenance reports, a squadron at risk of having a
mishap would greatly enhance the commanding officer's ability
to prevent mishaps. This thesis attempts to develop a
predictive statistical model which identifies high risk
squadrons based on existing monthly maintenance reports. That
is, we want to identify a set of conditions in previous
months' maintenance records which presage with high
probability a mishap in the next month. Every squadron is
required to submit monthly maintenance reports that detail the
type and amount of maintenance performed on each aircraft in
that month and report maintenance personnel levels within the
squadron. Many experienced people involved in Naval Aviation
believe that they should be able to use these monthly reports
to identify a squadron at risk of having a mishap.

The  following  is  a  problem  statement  from  a  September  1993 
Marine  Corps  aviation  safety  standdown: 

1.  Topic:  Identify  high  risk  aircraft  units. 

2.  Discussion:  Commanders  must  understand  and  use  all 
available  statistical  and  subjective  readiness  indicators 
to  evaluate  the  risk  level  of  their  operational  aircraft 
units.  Many  readiness  indicators  are  available  for 
Commanders  to  effectively  evaluate  and  strengthen  unit 
readiness,  but  may  not  be  consistently  used.  Commander 
Marine  Forces  Pacific  (MARFORPAC)  recommends  the  Naval 
Safety  Center  develop  a  quantitative  formula  that  assigns 
risk  values  to  leading  indicators  which  can  be  used  to 
identify  high  risk  squadrons  and  forecast  and  manage 
risk.

3. Action: Safety Division, using the resources
available at the Naval Postgraduate School and Naval
Safety Center, develop a quantitative formula which
assigns risk values to squadron aircraft utilization rate,
manning rates, mission capable rates, Status of Resources
and Training System (SORTS) data, and operations tempo, to
identify high risk squadrons. [Ref. 1]

The  Marine  Corps  is  looking  for  a  predictive  statistical 
model  that  includes  all  aircraft  types,  but  because  of 
possible  different  operating  environments  and  procedures 
between  aircraft  types,  this  thesis  focuses  on  one  particular 
aircraft.  If  a  powerful  predictive  statistical  model  is 
developed  for  this  particular  aircraft,  then  there  is  hope 
that  the  analysis  and  the  statistical  model  could  be  expanded 
to  include  all  aircraft  types.  The  scope  of  the  thesis  has 
been  narrowed  to  developing  a  predictive  statistical  model  for 
the  Marine  Corps  AV-8B  Harrier  aircraft. 

There are obviously thousands of influences on aircraft
mishaps, but this thesis focuses on just existing monthly
maintenance reports. It is conjectured that probably the
greatest influence on aircraft mishaps is the commanding
officer's attitude concerning safety. However, this
is impossible to quantify and is not included in this study.
The operations tempo of a squadron may also greatly influence
mishaps but is difficult to quantify, even as a categorical
variable, and an acceptable operations tempo variable was not
found to include in this thesis. For the preceding reasons,
any model developed may not be a powerful model in forecasting
mishaps, but could be used as a tool for commanding officers
to help identify a squadron at risk of having a mishap.

The  overall  goal  of  the  predictive  statistical  model  is  to 
identify  high  risk  squadrons  based  on  existing  monthly 
maintenance  and  personnel  reports,  and  not  to  determine  the 
cause of mishaps. The statistical model will not determine if
a squadron is doing the correct amount of maintenance or if
the squadron is adequately manned, but rather whether, given
the reported numbers, the squadron is at high risk of having a
mishap.

The predictive statistical model will be developed by
first determining in which of the variables, or combinations
of the variables, there is a significant difference in the
previous month's maintenance pattern of a mishap and a
non-mishap squadron. These variables can then be used with
various statistical prediction and classification methods to
attempt to forecast high risk squadrons.

B. HISTORICAL BACKGROUND

A Defense Technical Information Center search did not
produce any related references on the topic of predicting
aircraft mishaps based on monthly maintenance reports. A
report titled "Marine Corps Aviation Mishap Rate Assessment
Study" dated February 1992 includes some analysis of a similar
problem [Ref. 2]. The study attempted to explain why the
Marine Corps 1990 mishap rate was alarmingly high.

One section of the study tested the hypothesis that there
exists a high correlation between increases in Direct
Maintenance Man Hours per flight hour and the increase in
mishap rate for 1990. For the test, data on Not Mission
Capable Supply, cannibalization, aircraft utilization, and
mishap rates were presented to the Naval Safety Center,
Statistics and Mathematics Department for analysis. The study
team was not able to demonstrate a correlation between
aircraft utilization rates and support resources as
independent variables and mishap rate as the dependent
variable. The study team concluded:

It  is  still  intuitively  appealing  that  there  is  a 
relationship  and  experts  in  the  field,  the  operators 
and  senior  officers,  firmly  believe  that  the 
relationship  is  valid.  [Ref.  2] 




[Text illegible in the source scan.]

This thesis approaches the latter alternative. The
previous study included all Marine Corps aircraft combined and
focused on the relationship with mishap rate. This thesis is
more narrowly defined in that it focuses on one particular
aircraft and attempts to predict mishaps, rather than mishap
rates.

C. TECHNICAL BACKGROUND

The goal of any statistical model developed would be to
accurately classify a squadron as a mishap or non-mishap
squadron in the next month based on the current month's
maintenance reports. The monthly maintenance report data
consists of numerous maintenance variables that are believed
to possibly influence mishaps. Ideally, a function can be
developed which uses these predictor variables to classify a
squadron as a mishap squadron. Therefore, a discriminant
function is needed that projects some combination of the
predictor variables to a decision space that classifies the
squadron as a mishap squadron or not. An example is the
following linear additive model:

$$f(x) = f(a_1 x_1 + a_2 x_2 + \cdots + a_n x_n) \qquad (1)$$

where $f(x)$ = point in the decision space (in k-space),
$x_i$ = ith independent predictor variable,
$a_i$ = ith coefficient,
$i = 1, 2, \ldots, n$.
In other words, if given the function f(x) and a new set
of variables x, the model would either classify a
squadron as an element of the acceptance region of the mishap
decision space, or not. A graphical explanation is shown
in Figure 1. The idea is to develop a function that maps the
set of independent predictor variables to an outcome, or
decision space, that is partitioned into an accept and reject
region so as to determine if a mishap may occur.


[Figure: product space (n-space) mapped to the outcome space]

Figure 1. Mapping n-space to the outcome space.
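
The mapping from predictor variables to a partitioned decision space can be sketched in a few lines of Python. This is only an illustration of a linear additive rule: the coefficient values, the cutoff, and the function names `linear_score` and `classify` are assumptions made for the example and do not come from the thesis data or analysis.

```python
from typing import Sequence


def linear_score(coeffs: Sequence[float], x: Sequence[float]) -> float:
    """Return a_1*x_1 + ... + a_n*x_n, the linear combination of predictors."""
    if len(coeffs) != len(x):
        raise ValueError("coefficients and predictors must have the same length")
    return sum(a * xi for a, xi in zip(coeffs, x))


def classify(coeffs: Sequence[float], x: Sequence[float], cutoff: float) -> str:
    """Partition the score axis at a hypothetical cutoff into two regions."""
    return "mishap" if linear_score(coeffs, x) >= cutoff else "non-mishap"


# Illustrative values only; not taken from the maintenance reports.
coeffs = [0.5, -0.2, 1.0]
new_month = [2.0, 1.0, 0.5]
print(linear_score(coeffs, new_month))       # 1.0 - 0.2 + 0.5
print(classify(coeffs, new_month, cutoff=1.0))
```

Any real use would require coefficients estimated from data; here they are fixed by hand purely to show how a score is mapped into an accept or reject region.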

Identifying  the  function  capable  of  this  classification  is 
not  the  only  problem.  Any  statistical  model  developed  from 
this  function  must  be  accurate  in  its  forecast  so  that  the 
model  will  be  useful.  But  the  statistical  model  also  needs  to 
minimize  the  probability  of  making  errors. 
The two types of errors that are of concern are type I and
type II errors. A type I error is defined as rejecting that
the outcome is from the event population, when it actually is
from the event population. In this statistical model a type
I error is when a squadron is classified as a non-mishap
squadron when it is actually a mishap squadron. The
probability of a type I error is given by

$$\alpha = \Pr\{\text{predict non-mishap} \mid \text{actually a mishap}\}. \qquad (2)$$

A type II error is defined as accepting that the outcome
is from the event population when it actually is not from the
event population. In this statistical model a type II error
is when a squadron is classified as a mishap squadron when it
is actually not a mishap squadron. The probability of a type
II error is given by

$$\beta = \Pr\{\text{predict mishap} \mid \text{actually no mishap}\}. \qquad (3)$$

Obviously the type I error is the more serious of the two
errors in this statistical model since a mishap occurs that
was not predicted. But a high probability of a type II error,
although no mishap occurred, can render the model useless. If
the probability of a type II error is high, it means that the
model is predicting an unacceptable number of squadrons as
mishap squadrons when they are non-mishap squadrons.

Any model developed needs to minimize the probabilities of
the type I and type II errors as much as possible, while still
providing accurate predictions. The two types of errors are
interrelated in that if one type of error is minimized it is
usually at the expense of the other. Generally, if the
probability of a type I error is minimized while ignoring the
probability of a type II error, the probability of making a
type I error may be satisfactory but the probability of making
a type II error will be unsatisfactorily high. In this
statistical model this may result in an acceptable level of
type I errors, failing to predict a mishap when a mishap
actually occurs, but an unacceptable level of type II errors,
predicting a mishap when a mishap did not occur. The type I
error would be lowest if all squadrons were predicted as
mishap squadrons, because there would be no type I errors.
But the type II errors would be maximized, since
most of the squadrons would have a false alarm, rendering the
model useless.
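
The two error probabilities can be estimated from a set of classified squadron-months by simple counting. The sketch below follows the definitions used in this section (type I: predicting non-mishap for an actual mishap; type II: predicting mishap when none occurred). The function name `error_rates` and the toy data are assumptions for illustration only.

```python
def error_rates(actual, predicted):
    """Estimate (alpha, beta) from paired booleans, where True means mishap.

    alpha = share of actual mishaps predicted as non-mishap (type I here).
    beta  = share of actual non-mishaps predicted as mishap (type II here).
    """
    misses = sum(1 for a, p in zip(actual, predicted) if a and not p)
    false_alarms = sum(1 for a, p in zip(actual, predicted) if not a and p)
    n_mishap = sum(1 for a in actual if a)
    n_clear = len(actual) - n_mishap
    alpha = misses / n_mishap if n_mishap else 0.0
    beta = false_alarms / n_clear if n_clear else 0.0
    return alpha, beta


# Toy data: two mishap months out of eight squadron-months.
actual = [True, False, False, False, True, False, False, False]

# Degenerate rule: predict a mishap for every squadron-month.
always_mishap = [True] * len(actual)
print(error_rates(actual, always_mishap))  # (0.0, 1.0)
```

The degenerate rule shows the trade-off in its extreme form: no mishap is ever missed, but every non-mishap month raises a false alarm.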

Dividing the data into mishap and non-mishap observations
creates two separate populations with numerous independent
predictor variables. Marginal analysis of each of these
univariate independent predictor variables from the separate
populations can determine if there exists a significant
difference between a mishap and non-mishap squadron with
respect to that particular variable alone. For example, maybe
the classification is a function of just one variable, i.e.,

$$f(x) = f(a_n x_n). \qquad (4)$$

To determine if there is a significant difference in the
distribution of a variable between two populations, it is
assumed that the two populations have similar distributions
with possibly different parameters. To graphically show
differences, the density traces of the variable from each
population are superimposed on the same density plot. Any
significant differences can be determined by comparing the two
traces.
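
A density trace is a smoothed estimate of a variable's distribution. As a minimal sketch, assuming a Gaussian kernel with a hand-picked bandwidth (the thesis does not state which smoother was used), the two traces can be evaluated on a common grid so they can be compared point by point. The sample values and the name `density_trace` are hypothetical.

```python
import math


def density_trace(sample, grid, bandwidth):
    """Gaussian kernel density estimate of `sample` at each grid point."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - s) / bandwidth) ** 2) for s in sample)
        for g in grid
    ]


# Hypothetical values of one predictor for the two populations.
non_mishap = [10.0, 11.0, 12.0, 12.5, 13.0]
mishap = [12.0, 13.0, 14.0, 14.5, 15.0]

grid = [5.0 + 0.5 * i for i in range(31)]  # 5.0, 5.5, ..., 20.0
trace_non = density_trace(non_mishap, grid, bandwidth=1.0)
trace_mis = density_trace(mishap, grid, bandwidth=1.0)

# Print both traces side by side; plotting them on one axis gives the
# superimposed density plot described in the text.
for g, d0, d1 in zip(grid, trace_non, trace_mis):
    print(f"{g:5.1f}  {d0:.3f}  {d1:.3f}")
```

The bandwidth controls how smooth the trace is; a different choice changes the shape of the curves but not the idea of overlaying the two populations.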

For example, this technique could be used to determine
significant differences in a predictor variable between
separate populations of non-event and event observations.
Figure 2 shows two superimposed density traces of a variable
from two separate populations, in which the variable is
clearly larger for the event observations than for the
non-event observations. In this example the plotted variable
could possibly be used to classify an observation as an event
or non-event by setting the rejection region at w, thereby
accepting that a new set of values comes from the non-event
population if the outcome is less than w. As can be seen in
this example, a model using the example variable would be very
powerful, with a low probability of both types of error. But
if the density traces shift so that they overlap more, then
using the same w will result in the exact same type I error
while the type II error will increase dramatically.
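
The effect of overlap at a fixed cutoff w can be checked numerically. The sketch below assumes that it is the non-event population that moves toward the event population while the event values stay put, which is one reading of the shift described above. All values and the name `rates_at_cutoff` are hypothetical.

```python
def rates_at_cutoff(non_event, event, w):
    """Return (type I rate, type II rate) when values >= w are called events."""
    alpha = sum(1 for v in event if v < w) / len(event)          # missed events
    beta = sum(1 for v in non_event if v >= w) / len(non_event)  # false alarms
    return alpha, beta


event = [14.0, 15.0, 16.0, 17.0]
separated = [8.0, 9.0, 10.0, 11.0]      # non-events well below the events
overlapping = [12.0, 13.0, 14.0, 15.0]  # non-events shifted toward the events
w = 12.5

print(rates_at_cutoff(separated, event, w))    # (0.0, 0.0)
print(rates_at_cutoff(overlapping, event, w))  # (0.0, 0.75)
```

With the cutoff held at the same w, the type I rate is unchanged while the type II rate rises from 0 to 0.75 in this toy example.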


Figure 2. Density trace comparisons of one
dimensional data with a significant difference
in population density.
On the other hand, Figure 3 shows two superimposed density
traces of a variable from two separate populations that show
no obvious significant differences between non-event and event
observations. In this example, there is no way that this
variable could be used to classify a squadron as a mishap or
non-mishap squadron because there is no rejection region that
can be identified that could be used to distinguish between
the two populations with a high degree of accuracy.

Figure 3. Density trace comparisons of data with
no significant difference in population density.

The above discussion uses just an analysis of the
univariate independent predictor variables to attempt to
classify an observation as an event or a non-event. It is
also possible that combinations of independent variables may
produce the function that classifies the dependent variable as
in Equation 1. Producing a coded scatter plot of each
independent variable versus each other independent variable
may reveal a clustering of observations that could be used to
classify the dependent variable as an event observation. A
coded scatter plot provides a three dimensional display by
having the two independent predictor variables plotted against
each other and having separate symbols showing event and
non-event observations. This provides an easy way to determine
if any observations are clustering, i.e., if most of the event
observations are grouped together it shows that the
combination of variables may produce a model that can classify
the observation as an event or non-event.

Figure 4 shows an example of two independent variables, x
and y, that are being used to attempt to discriminate between
two populations. A plot of x and y with the two separate
populations coded could show any clustering of the dependent
variable. As can be seen in Figure 4, there is no rejection
region that can be used to separate the two populations and
classify an event or non-event with a high degree of accuracy.

Figure 5 shows that when the observations are from the
event population all of the observations are in a tight and
separated cluster. This shows the possibility of using x and
y to classify an observation as an event or non-event. As can
be seen in Figure 5, by setting the rejection region at the
indicated line, the event and non-event observations can be
accurately classified. For example, the indicated rejection
line in Figure 5 is a function of x and y that maps to a point
in a two-space decision space

$$f(x, y) = f(a_x x + a_y y) \qquad (5)$$

where $a_x$ and $a_y$ are the coefficients of x and y.
Figure 4. Coded scatter plots showing no breakout
or clustering of event observations.

So, given any x and y, the function will map the
observation onto the decision space. If the point lies
below the acceptance region dividing line, then that
observation is classified as an event; if the point
lies above the acceptance region dividing line, then that
observation is classified as a non-event.

Higher order combinations of the variables may also
provide the predictive statistical model. Instead of
graphical analysis, the higher order functions are
investigated by multivariate techniques such as discriminant
analysis, logistic regression, and cluster analysis.
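
Of the multivariate techniques named above, logistic regression is the simplest to sketch without a statistics library. The example below fits a two-variable logistic model by plain batch gradient descent on made-up, well-separated data. The learning rate, iteration count, data values, and function names are assumptions for illustration; this is not the fitting procedure or software used in the thesis.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_features = len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for row, label in zip(X, y):
            p = sigmoid(bias + sum(w * v for w, v in zip(weights, row)))
            err = p - label
            for j, v in enumerate(row):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(X)
    return weights, bias


def predict_proba(weights, bias, row):
    """Estimated probability that `row` belongs to the event population."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, row)))


# Made-up, already scaled predictors (x, y) with event labels 0/1.
X = [[0.0, 0.2], [0.3, 0.1], [0.2, 0.4], [0.9, 0.8], [0.8, 1.0], [1.0, 0.7]]
y = [0, 0, 0, 1, 1, 1]
weights, bias = fit_logistic(X, y)
print([round(predict_proba(weights, bias, row), 2) for row in X])
```

Classifying an observation as an event when the estimated probability exceeds a chosen cutoff (0.5 is a common default) turns the fitted model into a decision rule, and the cutoff again trades type I against type II errors.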
Figure 5. Coded scatter plot showing a significant
breakout or clustering of event observations.
II. DATA OVERVIEW

A. DATA DESCRIPTION

1. Mishap Data

The  aircraft  mishap  data  was  provided  by  Headquarters 
Marine  Corps  Aviation  Safety  Division  and  includes  data  on 
nine  AV-8B  Harrier  squadrons  over  the  time  period  of  January 
1990  to  November  1993.  The  data  consisted  of  the  date, 
severity,  squadron,  and  brief  description  of  all  Flight 
Mishaps involving Harriers in this period. A naval aircraft
Flight Mishap is defined as an unplanned event directly
involving naval aircraft in which there was $10,000 or greater
aircraft damage, or loss of aircraft, and intent for flight
existed at the time of the mishap. Table I shows the
definitions  of  the  mishap  severity  classes  based  on  personal 
injury  and  property  damage.  Any  occurrence  in  which  total 
cost  of  property  damage  is  less  than  $10,000  and  there  are  no 
defined  injuries,  is  not  considered  a  reportable  naval 
aircraft  mishap. 

The  description  of  the  mishap  is  an  excerpt  from  the 
Mishap  Investigation  Report  that  provides  a  short  narration  of 
the  causal  factors  of  the  mishap.  The  causal  factors  can  be 
divided  into  three  basic  categories.  The  first  is  mishaps 
caused  by  human  factors,  i.e.,  human  error  by  the  aircrew, 
supervisory  personnel,  maintenance  personnel,  or  facilities. 
The  second  factor  is  a  material  failure,  i.e.,  a  component 
fails  causing  the  mishap.  And  the  last  is  mishaps  caused  by 
an  aircraft  hitting  a  bird. 

Mishap Severity   Description

Class A           A mishap in which the total cost of property
                  damage is $1,000,000 or greater; or a naval
                  aircraft is destroyed or missing; or any
                  fatality or permanent total disability occurs
                  with direct involvement of naval aircraft.

Class B           A mishap in which the total cost of property
                  damage is $200,000 or more, but less than
                  $1,000,000; or a permanent partial disability,
                  or hospitalization of five or more personnel.

Class C           A mishap in which the total cost of property
                  damage is $10,000 or more, but less than
                  $200,000; or injury results in one or more
                  lost workdays.

Table I. Classifications of Naval Aircraft Mishaps.
From [Ref. 3].

All three severity classes of mishaps (A, B, and C) were
combined to form a dependent variable that indicates whether
or not a squadron had a mishap in a given month. All causal
factors were combined except for the birdstrike mishaps.
Since there is no credible way to predict birdstrike mishaps,
they were not considered a mishap month in the analysis. All
of the remaining mishap observations were included in the
belief that the mishap observations and independent predictor
variables could be used to develop a statistical model that
can discriminate between a mishap and non-mishap squadron
based on monthly maintenance reports. In three separate cases
a squadron that had two mishaps in the same month was included
as a single observation of a mishap month.
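
The construction of the dependent variable described above can be sketched as follows. The record layout, the cause labels, and the squadron names are hypothetical; only the rules (combine Classes A, B, and C, exclude birdstrikes, and count a month once even if it has two mishaps) are taken from the text.

```python
from typing import Iterable, NamedTuple, Set, Tuple


class Mishap(NamedTuple):
    squadron: str
    year: int
    month: int
    severity: str  # "A", "B", or "C"
    cause: str     # hypothetical labels: "human", "material", "birdstrike"


def mishap_months(records: Iterable[Mishap]) -> Set[Tuple[str, int, int]]:
    """Return the (squadron, year, month) keys flagged as mishap months.

    Using a set collapses two mishaps in the same squadron-month into a
    single observation; birdstrike mishaps are left out entirely.
    """
    return {
        (r.squadron, r.year, r.month)
        for r in records
        if r.severity in {"A", "B", "C"} and r.cause != "birdstrike"
    }


records = [
    Mishap("SQ-1", 1991, 3, "A", "human"),
    Mishap("SQ-1", 1991, 3, "C", "material"),    # second mishap, same month
    Mishap("SQ-2", 1992, 7, "B", "birdstrike"),  # excluded from the analysis
]
flags = mishap_months(records)
print(sorted(flags))  # [('SQ-1', 1991, 3)]
```

Every squadron-month not in the returned set would be treated as a non-mishap observation.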

2. Maintenance Data

The  maintenance  data  was  provided  by  the  Naval  Safety 
Center  through  the  Naval  Aviation  Logistic  Data  Analysis 
system.  This  data  consisted  of  the  Equipment  Condition 
Analysis  report  and  the  maintenance  man  hours  per  flight  hour 
for the nine squadrons. The Equipment Condition Analysis
report data consisted of the reported Aviation Maintenance and
Material Management (3M) system data for each squadron in each
month. The amount of maintenance hours is divided into
separate categories based on the information on the
Maintenance Action Form. The Maintenance Action Form is the
paperwork that describes particular maintenance to be done and
assigns the maintenance to the appropriate work center [Ref.
4]. Included in this data for each squadron is:

1.  Date  by  month  from  January  1990  to  November  1993. 

2.  Average  number  reporting  inventory:  average  number 
of  aircraft  assigned  in  each  month. 

3. Flight hours: total flight hours in each month.

4. Number sorties: total number of flights in each
month.

5. Number landings: total number of landings in each
month.

6. Hours Equipment in Service: total number of hours
that the aircraft were available for use in each month.

7. Hours Not Mission Capable Maintenance-Scheduled:
total number of hours that aircraft were not capable of
performing any of their missions due to scheduled
maintenance requirements in each month. Scheduled
maintenance is the periodic prescribed inspection/
servicing of equipment, done on a calendar or hours of
operation basis. An aircraft is considered Not Mission
Capable Maintenance-Scheduled only if panels and
equipment removed to conduct area inspections cannot be
replaced within two hours.

8. Hours Not Mission Capable Maintenance-Unscheduled:
total number of hours that aircraft were not capable of
performing any of their missions due to unscheduled
maintenance requirements in each month. All not mission
capable maintenance hours that are not Not Mission
Capable Maintenance-Scheduled are classified as Not
Mission Capable Maintenance-Unscheduled. Unscheduled
maintenance is performed when corrective maintenance is
required.

9. Hours Not Mission Capable Supply: total number of
hours that aircraft were not capable of performing any
of their missions because maintenance required to clear
the discrepancy cannot continue due to a supply shortage.

10.  Hours  Partially  Mission  Capable  Maintenance- 
Unscheduled:  total  number  of  hours  that  the  aircraft 
were  capable  of  performing  at  least  one,  but  not  all  of 
their  missions  due  to  unscheduled  maintenance 
requirements  in  each  month. 

11.  Hours  Full  Mission  Capable  Maintenance-Unscheduled: 
total  number  of  hours  that  aircraft  were  capable  of 
performing  all  of  their  missions  but  are  not  at  optimum 
performance  due  to  unscheduled  maintenance  requirements 
in  each  month. 

12.  Maintenance  Man  Hour  per  Flight  Hour:  average 
number  of  hours  of  maintenance  done  per  flight  hour  in 
each  month.  Derived  by  dividing  total  maintenance  hours 
by  total  hours  flown. 


The maintenance data was reduced somewhat. The number of
landings was highly correlated with the number of sorties,
so the number of landings was omitted because the number of
sorties provides essentially the same information. The hours
Equipment in Service was perfectly correlated with the average
number of aircraft assigned, since the total hours equipment in
service is the average number of aircraft multiplied by the
total number of hours in the month. Therefore the hours
equipment in service was not included in the analysis. If a
squadron had numerous missing data in a particular month, that
month was deleted from the data. In addition, if a squadron
flew fewer than 100 flight hours in a month, that month was
deleted, since it was not a normal operating month and might
skew the results of the analysis.
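The reduction rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names, the squadron labels, the sample values, and the cutoff used for "numerous missing data" are hypothetical, since the thesis gives only the 100 flight hour rule explicitly.

```python
# Illustrative sketch only: field names and MAX_MISSING are assumptions.
MIN_FLIGHT_HOURS = 100   # cutoff stated in the text
MAX_MISSING = 2          # hypothetical cutoff for "numerous missing data"
REDUNDANT = ("landings", "hours_in_service")  # variables omitted as redundant


def reduce_records(records):
    """Drop redundant fields, then drop incomplete or abnormal months."""
    kept = []
    for rec in records:
        row = {k: v for k, v in rec.items() if k not in REDUNDANT}
        missing = sum(1 for v in row.values() if v is None)
        if missing > MAX_MISSING:
            continue  # too many missing entries in this squadron-month
        hours = row.get("flight_hours")
        if hours is None or hours < MIN_FLIGHT_HOURS:
            continue  # not a normal operating month
        kept.append(row)
    return kept


records = [
    # hours_in_service = aircraft x hours in the month (20 x 744, 20 x 672)
    {"squadron": "A", "month": "1990-01", "flight_hours": 412.0,
     "sorties": 310, "landings": 325, "aircraft": 20, "hours_in_service": 14880},
    {"squadron": "A", "month": "1990-02", "flight_hours": 64.5,
     "sorties": 51, "landings": 55, "aircraft": 20, "hours_in_service": 13440},
    {"squadron": "B", "month": "1990-01", "flight_hours": None,
     "sorties": None, "landings": None, "aircraft": None, "hours_in_service": None},
]
reduced = reduce_records(records)
```

Only the first record survives in this made-up sample: the second falls below 100 flight hours and the third has too many missing entries.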

3. Personnel Data

The personnel data was provided by Headquarters Marine
Corps and consisted of the number of each maintenance related
Military Occupational Specialty in each squadron in each
month. Eight squadrons were included in this data. The data
provided was the number of each specialty, and was not
compared with the squadron Table of Organization to determine
if a squadron was manned at a level consistent with the Table
of Organization. The data consisted of quarterly data from
January 1990 to December 1992 and monthly data from February
1993 to November 1993. The month of January 1993 was missing
from the data. The following is a brief description of the
provided Military Occupational Specialties:

1.  Aircraft  Mechanic:  responsible  for  engine  repair, 
daily  inspection,  and  launching  and  recovering  aircraft. 

2. Aircraft Maintenance Chief: senior enlisted person in
the maintenance department. Usually only a couple in the entire
squadron: one as maintenance chief, responsible for
overseeing the department, and one as the maintenance
control chief, responsible for assigning maintenance on a
particular aircraft to the responsible work center.

3. Aircraft Maintenance Administrative Clerk and Aircraft
Maintenance  Data  Analysis  Technician:  responsible  for 
tracking  maintenance  and  preparing  required  reports. 

4. Aircraft Maintenance Hydraulics and Pneumatics
Mechanic: responsible for maintenance of the hydraulic
systems  and  aircraft  body  maintenance. 

5.  Flight  Equipment  Marine:  responsible  for  maintenance 
of  aircrew  personal  flight  equipment. 

6.  Aircraft  Maintenance  Ground  Support  Equipment 
Mechanic:  responsible  for  maintenance  on  ground  support 
equipment used in the maintenance of the aircraft.

7.  Aircraft  Safety  Equipment  Mechanic:  responsible  for 
maintenance  of  ejection  seats  and  environmental  systems. 

8. Aircraft Communications/Navigation System Technician:
responsible  for  maintenance  of  communications/navigation 
and  related  systems. 

9. Aircraft Electrical System Technician: responsible
for maintenance of electrical systems.

10. Avionics Maintenance Chief: senior enlisted in
avionics division.

11. Aircraft Ordnance Technician: responsible for
ordnance delivery systems and loading of ordnance.

12. Aviation Ordnance Chief: senior enlisted in ordnance
division.

All twelve specialties were included in the analysis,
although it is doubtful that some of them would affect
aircraft mishaps. The aircraft maintenance chief, avionics
chief, and ordnance chief specialties probably will not be
significantly different between mishap and non-mishap
squadrons, since all squadrons have just one or two of these
specialties and they are almost always manned. The data analysis
section, the flight equipment section, the ground support
section, and the safety equipment section probably will not be
significantly different between mishap and non-mishap
squadrons, since maintenance performed by these sections is
highly specialized and is rarely, if ever, considered a causal
factor in an aircraft mishap.

B. DATA REDUCTION

1. One Month Lag

All of the above data are contained in reports that are
generated at the end of the month being reported upon. Hence
this data is not useful in trying to predict a mishap in that
month, since the month is already past. Also, a squadron that
has a mishap will sometimes drastically change its operating
procedures, affecting the maintenance reports for that month.
For these reasons the maintenance figures reported by a
squadron for each month were used as independent variables to
attempt to predict a mishap in that squadron in the next
month. In effect, this creates maintenance variables with a one
month lag as predictor variables for a mishap in a given month.
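The one month lag can be illustrated with a short Python sketch that pairs each month's reported figures with the following month's mishap indicator for the same squadron. The tuple layout, the integer month index, and the sample values are assumptions made for illustration, not the format of the actual reports.

```python
def lag_one_month(rows):
    """Pair each month's predictors with the NEXT month's mishap flag.

    rows: list of (squadron, month_index, predictors, mishap) tuples, where
    month_index is a consecutive integer (a hypothetical encoding).
    Returns (squadron, target_month, lagged_predictors, mishap) tuples.
    """
    by_key = {(sq, m): (x, y) for sq, m, x, y in rows}
    lagged = []
    for sq, m in sorted(by_key):
        following = by_key.get((sq, m + 1))
        if following is None:
            continue  # no following month on record, so nothing to predict
        predictors = by_key[(sq, m)][0]
        lagged.append((sq, m + 1, predictors, following[1]))
    return lagged


rows = [
    ("A", 1, {"mmh_per_fh": 18.2}, 0),
    ("A", 2, {"mmh_per_fh": 21.7}, 1),
    ("A", 3, {"mmh_per_fh": 35.0}, 0),
]
lagged = lag_one_month(rows)
```

In this made-up sample the mishap in month 2 is paired with the month 1 figures, so the month 2 report, which may itself reflect changes made after the mishap, is never used to predict it.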

2. Final Data

The original data set contained approximately 432
observations (nine squadrons x 48 months of data) that had 54
mishap observations and 378 non-mishap observations. Each
observation consisted of a month with a binary dependent
variable indicating if a mishap occurred or not, and 23
possible independent predictor variables. After the above
reductions in the data, the final data set used in the
analysis contained 368 observations that had 44 mishap
observations and 324 non-mishap observations. Each
observation includes the binary dependent variable and 21
possible independent predictor variables.

3. Model Formulation

The final data set and model of the problem can be
considered similar to Anderson's Iris Data made famous by
Fisher [Ref. 5]. In that data set there were measurements
from three varieties of flowers and the problem was to develop
a model and a procedure that would classify a particular
flower as one of the three varieties. The data set consisted
of a set of four measurements on each of 150 flowers; the
sample contained 50 flowers of each variety. So
this data may be regarded as 150 observations
in four-dimensional space. The goal of a model is to develop
a function that maps the observations from four-dimensional
space to some outcome space that will enable the
classification of the flower in a particular category. In
this example, by plotting petal length versus petal width, and
coding each observation, an obvious clustering by type of
flower is shown that can be used to classify each flower.

The final mishap data set is somewhat similar to the above
example, but more complex. The final data set was
a set of 21 measurements on each of 368 separate monthly
observations. The 21 measurements include all of the
personnel and maintenance figures discussed previously, for
that particular month. The sample contained 324 non-mishap
monthly observations and 44 mishap monthly observations. The
data can then be regarded as 368 observations in
twenty-one dimensional space.
III. DATA ANALYSIS


A.  APPROACH  TO  ANALYSIS 

The approach to analysis was to first perform a
one-dimensional graphical marginal analysis of each independent
predictor variable. For each independent predictor variable,
the density traces of the two populations, mishap and
non-mishap, were superimposed to determine any significant
differences between the two populations. As discussed earlier, if
any of the independent predictor variables indicates a
significant difference between the mishap and non-mishap
populations, that variable or variables could be used to
discriminate an observation as a mishap or non-mishap
squadron.
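As an illustration of superimposing density traces for the two populations, the following Python sketch evaluates a kernel density estimate for each group on a common grid and measures how much the two curves overlap. The Gaussian kernel, the bandwidth, and the sample values are assumptions; the thesis does not state which kernel or window width the AGSS density trace used.

```python
import math


def density_trace(sample, grid, bandwidth):
    """Gaussian kernel density estimate of `sample` evaluated on `grid`."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - x) / bandwidth) ** 2) for x in sample)
        for g in grid
    ]


# Made-up values of one predictor for the two populations.
mishap = [14.0, 17.5, 19.0, 22.0, 24.5]
non_mishap = [12.0, 15.0, 16.5, 18.0, 20.0, 21.5, 23.0, 26.0]

grid = [5.0 + 0.5 * i for i in range(61)]  # 5.0, 5.5, ..., 35.0
trace_mishap = density_trace(mishap, grid, bandwidth=3.0)
trace_non = density_trace(non_mishap, grid, bandwidth=3.0)

# Area shared by both curves: near 1.0 means the traces are almost
# identical, near 0.0 means the two populations barely overlap.
step = grid[1] - grid[0]
overlap = sum(min(a, b) for a, b in zip(trace_mishap, trace_non)) * step
```

A predictor that separates the two populations well would give a small overlap; a large overlap corresponds to the situation where the superimposed traces are visually indistinguishable.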

Following  the  one-dimensional  analysis  a  two-dimensional 
graphical  analysis  of  the  independent  predictor  variables  will 
be  performed  to  determine  any  pair  of  predictor  variables  that 
can  be  used  to  classify  a  squadron  as  a  mishap  or  non-mishap 
squadron.  All  pairs  of  the  possible  independent  predictor 
variables  will  be  plotted  in  coded  scatter  plots  to  determine 
which  pairs  of  variables  could  possibly  be  used  to  classify  a 
squadron  as  a  mishap  squadron.  If  any  of  the  coded  scatter 
plots  show  a  clustering  of  mishap  or  non-mishap  observations, 
then  these  pairs  of  independent  variables  could  possibly  be 
used  to  discriminate  between  mishap  or  a  non-mishap  squadron. 

Following the one- and two-dimensional graphical analysis
the independent predictor variables will be analyzed in higher
dimensions with the multivariate techniques of principal
components and logistic regression to attempt to develop the
predictive statistical model. These techniques are intended to
uncover any higher order relationship that may be used to
classify a squadron as a mishap or non-mishap squadron.

All graphical output was produced using IBM's A Graphical
Statistical System (AGSS) [Ref. 6] on a 486DX-50 personal
computer.

B.  PERSONNEL  DATA  ANALYSIS 

The  twelve  military  occupational  specialties  considered 
were  plotted  on  density  trace  plots  to  determine  if  there  was 
a  first  order  significant  difference  in  the  distributions  of 
the  military  occupational  specialties  between  a  mishap 
squadron  and  a  non-mishap  squadron  manning  level.  All  of  the 
plots  reveal  that  there  is  no  discernable  area  (marginal) 
effect  between  a  mishap  squadron  and  a  non-mishap  squadron. 
All  of  the  density  traces  of  the  personnel  data  are  reproduced 
in  Appendix  A.  A  representative  plot  of  the  Aircraft  Mechanic 
specialty  is  shown  in  Figure  6.  As  can  be  seen,  there  is  not 
a  significant  difference  in  the  density  plots  of  aircraft 
mechanics  assigned  to  mishap  and  non-mishap  squadrons. 

The  manning  level  results  are  undoubtedly  highly 
influenced  by  the  fact  that  most  of  the  personnel  data  was 
reported  as  quarterly  figures.  Since  the  same  number  of 
personnel  was  reported  for  each  month  of  that  quarter,  the 
changes between mishap and non-mishap squadrons in each month
were not distinguishable.

It bears repeating that the personnel data was compared by
the total number of individuals in each specialty. This
number was not compared to the Table of Organization since the
goal of the thesis was to distinguish between a mishap and
non-mishap squadron, and not to determine if a squadron was
manned at Table of Organization level. This analysis also had
no way of analyzing the experience level of the individuals
assigned to different squadrons. It was assumed that the
experience level would be similar among squadrons, which may or
may not be true. The experience level among the maintainers
could influence the chances of the squadron having a mishap.

Figure 6. Density traces of Aircraft Mechanics
assigned to each squadron.

Based  on  the  above  one-dimensional  analysis,  the  personnel 
data  was  not  considered  significant  and  therefore  was  not 
included  in  any  further  analysis. 

C.  MAINTENANCE  DATA  ANALYSIS 

The  marginal  analysis  of  the  ten  possible  maintenance 
predictor  variables  was  done  by  plotting  density  traces  of 
each  variable  to  determine  if  there  was  a  first  order 
significant  difference  in  the  distributions  of  the  variable 
between  a  mishap  squadron  and  a  non-mishap  squadron.  All  of 
the  density  trace  plots  of  the  maintenance  independent 
predictor variables are reproduced in Appendix B. None of the
plots revealed any discernable area (marginal) effect in one
dimension between a mishap and non-mishap squadron. A
representative plot of Maintenance Man Hours per Flight Hour
is shown in Figure 7. The figure clearly shows that there is
not  a  significant  difference  between  the  maintenance  man  hours 
per  flight  hour  per  month  in  the  mishap  squadron  population 
and  non-mishap  squadron  population.  The  majority  of 
observations  fall  between  10  and  25  maintenance  man  hours  per 
flight  hour  with  no  way  of  separating  the  mishap  from  the  non- 
mishap  observations. 

Figure 7. Density trace of Maintenance Man Hours
per Flight Hour.

The one-dimensional analysis of all maintenance
independent predictor variables did not produce any
significant differences that could be used to classify a
squadron as a mishap or non-mishap squadron, so all of the
independent maintenance predictor variables were retained and
an analysis of two-dimensional relationships was performed.

To  determine  any  two-dimensional  relationship,  all 
possible  pairs  of  the  ten  independent  maintenance  predictor 
variables  were  plotted  in  coded  scatter  plots.  A  coded 
scatter  plot  is  a  technique  in  which  each  independent  variable 
can  be  plotted  against  all  other  independent  variables  to 
determine  any  second  order  interaction  of  variables  that  could 
be  used  in  classifying  a  squadron  as  a  mishap  or  non-mishap 
squadron.  A  coded  scatter  plot  will  show  the  relationship 
between  the  two  predictor  variables,  as  well  as  any  possible 
relationship  to  predict  a  mishap,  i.e.,  separate  clustering  of 
observations  that  can  discern  between  mishap  and  non-mishap 
squadrons.  The  coded  scatter  plots  showed  no  discernable  area 
of effect that could be used in discriminating between a
mishap  and  non-mishap  squadron.  A  representative  plot  is 
shown  in  Figure  8  with  all  the  possible  pairs  of  plots 
reproduced  in  Appendix  C.  The  coded  scatter  plots  show  mishap 
and  non-mishap  months  as  well  as  identifying  the  training 
squadron  versus  the  regular  squadrons.  The  training  squadron 
is  shown  separately  to  determine  if  the  training  environment 
is  possibly  significant  in  determining  mishaps. 

In  Figure  8  the  total  Flight  Hours  of  a  squadron  are 
plotted  against  the  Maintenance  Man  Hours  per  Flight  Hour.  It 
is  obvious  that  the  training  squadron  produces  more  flight 
hours  each  month  and  has  a  slightly  higher  maintenance  man 
hours per flight hour. But there is no discernable area of
effect  exclusive  to  a  mishap  or  non-mishap  squadron.  Ideally 
all  the  mishap  observations  would  be  clustered  together, 
separated  from  a  cluster  of  all  the  non-mishap  observations. 

Figure 8. Coded scatter plot of Maintenance Man
Hours per Flight Hour versus Total Flight Hours.

From the above two-dimensional analysis several
transformations of the original independent variables were
suggested. As could be expected, the total number of flight
hours and total number of sorties a squadron flies in a
particular month are highly correlated, and hence provide
the same information. Therefore the number of sorties was
dropped, because the total flight hours provides essentially
the same information as total number of sorties.
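A correlation check of this kind can be sketched as follows. The monthly values and the 0.95 cutoff are made up for illustration; the thesis does not report the correlation coefficient or a numeric threshold.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up monthly totals for one squadron.
flight_hours = [310.0, 355.5, 402.0, 288.0, 450.5, 376.0]
sorties = [240, 270, 315, 220, 350, 290]

r = pearson(flight_hours, sorties)
REDUNDANCY_CUTOFF = 0.95  # hypothetical threshold for "highly correlated"
drop_sorties = abs(r) > REDUNDANCY_CUTOFF
```

When two predictors are this strongly correlated, keeping both adds little information, which is the reasoning behind dropping one of them.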

Since the training squadron is always assigned more
aircraft and the other squadrons' total assigned aircraft can
vary significantly, the total flight hours may be skewed
somewhat. Therefore the total flight hours flown in each
month were divided by the total aircraft assigned that month,
to form a new univariate independent predictor variable of
average flight hours per aircraft assigned in each month.
This new independent predictor variable is basically an
indicator of the utilization rate of the aircraft in a
squadron.

Many  of  the  different  maintenance  predictor  variables  were 
spread  over  a  wide  range  because  of  a  few  unusually  high  or 
low  reported  maintenance  months.  These  months  could  not  be 
considered  outliers,  so  all  maintenance  predictor  variables 
were  transformed  by  taking  the  logarithm  of  the  variable, 
providing  a  more  presentable  plot,  without  changing  any  of  the 
existing  relationships. 
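The two transformations described above, the utilization rate and the logarithm of the maintenance variables, can be sketched as below. The field names and values are hypothetical, the natural logarithm is an assumption (the thesis does not state the base), and the sketch assumes every logged value is strictly positive.

```python
import math

# Hypothetical names for maintenance variables that are log-transformed.
LOGGED = ("nmc_unscheduled", "nmc_supply", "mmh_per_fh")


def transform(row):
    """Return the derived predictors for one squadron-month."""
    out = {"fh_per_aircraft": row["flight_hours"] / row["aircraft"]}
    for name in LOGGED:
        out["log_" + name] = math.log(row[name])  # natural log (assumed base)
    return out


row = {"flight_hours": 400.0, "aircraft": 20, "nmc_unscheduled": 1500.0,
       "nmc_supply": 300.0, "mmh_per_fh": 18.0}
derived = transform(row)
```

Because the logarithm is strictly increasing, the ordering of the months on each variable is preserved; only the spacing changes, which pulls in the few unusually high months.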

As  before,  a  one-dimensional  marginal  analysis  was 
performed  on  the  transformed  independent  predictor  variables. 
A  representative  density  trace  of  Flight  Hours  per  Aircraft  is 
shown  in  Figure  9,  with  the  remaining  density  traces  of  the 
transformed  independent  predictor  variables  reproduced  in 
Appendix D. This plot, like all of the others, clearly shows
that there is no discernable area of effect between
flight hours per aircraft in the mishap squadron population
and non-mishap squadron population.

A  two-dimensional  analysis  was  then  performed  on  the 
transformed  predictor  variables  using  coded  scatter  plots  to 
determine  any  significant  pairs  of  predictor  variables.  The 
eight  transformed  independent  maintenance  predictor  variables 
were  plotted  in  a  coded  scatter  plot  so  that  each  independent 
variable  could  be  plotted  against  all  other  independent 
variables  to  determine  any  pair  of  variables  that  could  be 
used  in  classifying  a  squadron  as  a  mishap  or  non-mishap 
squadron.  The  plot  showed  no  discernable  area  of  effect  that 
could be used in discriminating between a mishap and
non-mishap squadron. A representative coded scatter plot of the
logarithm of Maintenance Man Hours per Flight Hour versus
Flight Hours per Aircraft is shown in Figure 10, with the
remaining coded scatter plots of the transformed independent
predictor variables reproduced in Appendix E. The plot shows
mishap and non-mishap months as well as identifying the
training squadron versus the regular squadrons. The training
squadron is shown separately to determine if the training
environment is significant in determining mishaps. Included
in each of these plots is a locally weighted regression
scatter plot smoothing (LOWESS) function to help indicate any
relationship of the two independent variables. [Ref. 7]
Except for a few extreme months, the utilization rate and log
[...] clustering of the mishap observations separated from the
non-[...] Flight Hour decrease, probably due to the fact that the
aircraft are up and flying more and not breaking, or possibly
that there is less time to perform maintenance.

Figure 9. Density trace of Flight Hours per
Aircraft.

Figure 10. Coded scatter plot of the logarithm
of Maintenance Man Hours per Flight Hour versus
Flight Hours per Aircraft.

Based on the above one- and two-dimensional analysis of the
original and transformed predictor variables, there does not
appear to be any discernable relationship that could be used
in classifying a squadron at risk of having a mishap based
upon the existing monthly maintenance reports. Since none of
the independent variables was determined to be significant in
the above graphical analysis, all of the transformed
independent maintenance variables were retained as possible
predictor variables for an analysis of higher order
interactions.
IV. FURTHER ANALYSIS


A.  PRINCIPAL  COMPONENTS 

Since the initial graphical analysis did not reveal any
discernable first or second order discriminant function, the
method of principal components was used to determine if any
linear combination of variables exists that could be used to
classify a high risk squadron. The principal components are
uncorrelated linear combinations of the existing variables,
taken in order of decreasing variance, so that the first
component has the largest variance.

The  principal  components  method  in  effect  rotates  the 
coordinate  axes  of  the  data  to  a  new  coordinate  system  that 
has inherent statistical properties. This is a way of
reducing the number of variables to be considered by
discarding linear combinations which have small variances and
studying only those with large variances. The idea is to focus
on the largest variances among the variables to help
discriminate between mishap and non-mishap squadrons. [Ref. 8]

The data was divided into two separate data sets: a matrix
M, containing all the maintenance independent predictor
variables from the mishap observations, and a matrix N,
containing all the maintenance independent predictor variables
from the non-mishap observations. The non-mishap
observations were used as the baseline since the objective of
the thesis was to discriminate between mishap and non-mishap
observations. The principal components method was applied to
the data of non-mishap observations to produce a matrix of
principal component coefficients, P. The transpose of this
matrix was then multiplied by both matrices M and N,
producing matrices whose elements are the baseline component
values of the mishap and non-mishap data, $P^{T}M = M'$ and
$P^{T}N = N'$. The values of the original variables are thereby
projected onto the baseline principal axes. To see if these
component values are useful for classifying squadrons as mishap
and non-mishap
squadrons, the distributions of the first principal component
values are compared for significant differences. To compare
the principal components, the first principal component of
each of the component value matrices was standardized using
the mean and standard deviation of the non-mishap observations:

$$ w_{n,i} = \frac{n'_{i1} - \bar{n}'}{s_{n'}}, \qquad w_{m,i} = \frac{m'_{i1} - \bar{n}'}{s_{n'}} $$

where,

$w_{n,i}$ is the standardized first principal component of
the non-mishap predictor variables.

$w_{m,i}$ is the standardized first principal component of
the mishap predictor variables.

$\bar{n}'$ and $s_{n'}$ are the average and standard deviation of
the first principal component of the non-mishap
predictor variables.

$n'_{i1}$ and $m'_{i1}$ are the individual entries in the first
column of the two principal component matrices.

These  standardized  first  principal  components  are  then 
superimposed  on  a  density  trace  plot.  Any  significant 
difference  in  the  two  densities  of  the  plot  would  indicate  a 
transformation  of  axes  that  could  be  exploited  to  classify  the 
observations  as  mishap  or  non-mishap. 

Figure  11  shows  the  resulting  standardized  first  principal 
component  plot  of  the  transformed  independent  predictor 
variables. Although there is some difference shown, there is
no discernable difference that could be used to discriminate
between a mishap and non-mishap squadron. Therefore the method
of principal components indicates that there may not exist a
linear additive model of the independent predictor variables
that could be used to classify a mishap or non-mishap
squadron.
Figure 11. Density trace of standardized 1st
principal components of the transformed data.
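The baseline projection and standardization described in this section can be sketched in plain Python as follows. This is a minimal illustration, not the computation used in the thesis: the data are made up, observations are stored as rows (an assumed layout), and the first principal axis is found by power iteration, which is one of several ways to obtain it.

```python
import math


def covariance(rows):
    """Sample covariance matrix of observations stored one per row."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]


def first_axis(cov, iterations=500):
    """Leading eigenvector of `cov` (unit length) by power iteration."""
    v = [1.0] * len(cov)
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(rows, axis):
    """Component value of each observation along `axis`."""
    return [sum(a * x for a, x in zip(axis, r)) for r in rows]


# Made-up observations (rows) on two transformed predictors (columns).
non_mishap = [[2.0, 1.0], [3.0, 2.1], [4.0, 2.9], [5.0, 4.2], [6.0, 4.8]]
mishap = [[3.5, 2.4], [5.5, 4.6]]

axis = first_axis(covariance(non_mishap))  # baseline: non-mishap data only
base = project(non_mishap, axis)
mean = sum(base) / len(base)
sd = math.sqrt(sum((b - mean) ** 2 for b in base) / (len(base) - 1))

# Both groups are standardized with the NON-MISHAP mean and deviation.
w_non = [(b - mean) / sd for b in base]
w_mishap = [(m - mean) / sd for m in project(mishap, axis)]
```

By construction the non-mishap scores have mean 0 and standard deviation 1, so any shift or spread in the mishap scores is measured relative to the baseline population.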

B. LOGISTIC REGRESSION

To continue to develop a predictive statistical model the
method of logistic regression was pursued. Logistic
regression uses a linear logistic transformation function
that calculates the logarithm of the odds of an event
occurring, where the odds are the ratio of the probability of
success to the probability of failure. That is, it models the
likelihood that an event will occur given a particular set of
predictor variables. The logit model takes on the form [Ref. 9]

$$ p_i = \frac{1}{1 + e^{-(\alpha + \beta' x_i)}} $$

or

$$ \log\left[\frac{p_i}{1 - p_i}\right] = \alpha + \beta' x_i $$

where $p_i$ = probability of an event occurring
$x_i$ = attributes of an event
$\beta$ = coefficients vector
$\alpha$ = scalar.

Although the individual probabilities of an event occurring,
$p_i$, are not known, the information for each observation is
whether an event occurred or did not occur. The measured
dependent variable is $y_i = 1$ if an event occurred, and
$y_i = 0$ if no event occurred. This dependent variable is used
with maximum likelihood estimation for the logit model to
estimate $\alpha$ and $\beta$. [Ref. 10] Results from the
predictive statistical model provide an estimated forecast of
the probability of an event occurring based upon
a particular set of attributes. Using a selected critical
probability, any set of attributes can be classified as an
event or non-event observation based upon the log odds
calculated by the predictive model. The critical probability
should be selected so that type I errors are minimized while
maintaining an accurate predictive model.
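A minimal sketch of fitting the logit model by maximum likelihood and classifying with a critical probability is given below. It is an illustration under stated assumptions: the single-predictor data are made up, plain gradient ascent on the log-likelihood stands in for whatever estimation routine was actually used, and the step size, iteration count, and 0.5 critical probability are arbitrary choices.

```python
import math


def sigmoid(z):
    """Inverse of the logit: maps log odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(xs, ys, rate=0.1, steps=5000):
    """Estimate alpha and beta by gradient ascent on the log-likelihood."""
    p, n = len(xs[0]), len(xs)
    alpha, beta = 0.0, [0.0] * p
    for _ in range(steps):
        g_alpha, g_beta = 0.0, [0.0] * p
        for x, y in zip(xs, ys):
            err = y - sigmoid(alpha + sum(b * v for b, v in zip(beta, x)))
            g_alpha += err
            for j in range(p):
                g_beta[j] += err * x[j]
        alpha += rate * g_alpha / n
        beta = [b + rate * g / n for b, g in zip(beta, g_beta)]
    return alpha, beta


def classify(x, alpha, beta, critical=0.5):
    """Label x as an event (1) when its fitted probability reaches `critical`."""
    prob = sigmoid(alpha + sum(b * v for b, v in zip(beta, x)))
    return 1 if prob >= critical else 0


# Made-up data: one predictor, event indicator y (not linearly separable).
xs = [[0.5], [1.0], [1.5], [2.0], [2.5], [3.0], [3.5], [4.0]]
ys = [0, 0, 1, 0, 1, 0, 1, 1]

alpha, beta = fit_logit(xs, ys)
label_high = classify([4.0], alpha, beta)
label_low = classify([0.5], alpha, beta)
```

Raising or lowering `critical` trades one kind of classification error for the other, which is why the choice of critical probability matters as much as the fitted coefficients.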

A logistic regression of the aircraft mishap data was
performed in an attempt to produce a predictive statistical
model to forecast aircraft mishaps. Figure 12 shows the
superimposed plot of the log odds of the mishap and non-mishap
observations. In this plot the forecasted log odds is the
odds of each observation being classified as a non-mishap
observation. For example, given a set of predictor variables
from a particular squadron, the plot shows the log odds of
that squadron being classified as a non-mishap squadron. As
can be seen, the log odds of classifying the observations as
a non-mishap squadron fall between 0.73 and 0.99 for both
mishap and non-mishap observations. This indicates that the
predictive model has a high probability of classifying every
observation as a non-mishap. There is no critical probability
that would partition the decision space in a way that results
in an acceptable predictive statistical model while minimizing
errors.
Figure 12. Plot of the log odds of non-mishap and
mishap observations produced by logistic regression.


This predictive statistical model is not useful, since to
forecast a high percentage of mishaps almost all of
the squadrons would have to be told that they are at a high
risk of having a mishap. If all the squadrons are
told that they are at risk, then the predictive statistical
model will soon be disregarded.
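The trade-off can be made concrete with a small sketch. The forecast values below are made up so that both groups span the same 0.73 to 0.99 range reported above; they are not the thesis data, and the function name and critical values are illustrative only.

```python
def flag_summary(non_mishap_probs, labels, critical):
    """Flag a squadron-month as at risk when its forecast falls below `critical`.

    Returns (share of actual mishaps flagged, share of all months flagged).
    """
    flagged = [p < critical for p in non_mishap_probs]
    mishaps = sum(labels)
    caught = sum(1 for f, y in zip(flagged, labels) if f and y == 1)
    return caught / mishaps, sum(flagged) / len(flagged)


# Made-up forecasts of being a non-mishap month; label 1 marks a mishap.
probs = [0.74, 0.80, 0.85, 0.90, 0.93, 0.96, 0.98, 0.99,  # non-mishap months
         0.76, 0.88, 0.97]                                 # mishap months
labels = [0] * 8 + [1] * 3

for critical in (0.80, 0.90, 0.98):
    caught, flagged = flag_summary(probs, labels, critical)
    print(f"critical={critical:.2f} caught={caught:.2f} flagged={flagged:.2f}")
```

With overlapping forecasts like these, catching every mishap requires a critical value so high that most months are flagged, which is the practical objection raised above.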

C. DATA MANIPULATION

Since  all  of  the  preceding  detailed  analysis  failed  to 
provide  an  acceptable  predictive  statistical  model  to  forecast 
mishaps,  an  attempt  to  define  a  model  was  made  by  using 
different  subsets  of  the  original  data.  As  stated  in  the  data 
chapter, all mishaps were included in the original analysis,
except  for  birdstrike  mishaps. 

Since  all  the  variables  were  maintenance  related,  the 
first transformation eliminated all pilot error mishap
observations,  so  that  only  mishaps  that  involved  material 
failure  or  maintenance  personnel  error  were  analyzed.  All 
other  observations  were  considered  as  non-mishap  observations. 

The  second  transformation  took  the  above  transformation 
and  further  eliminated  all  Class  B  and  Class  C  mishap 
observations.  This  transformation  resulted  in  a  data  set  of 
maintenance  related  Class  A  mishaps.  All  other  observations 
were  considered  as  non-mishap  observations. 

Neither of the above transformations led to any
difference in the outcome of the analysis.
V. SUMMARY, CONCLUSIONS, AND RECOMMENDATIONS


A.  SUMMARY 

This  thesis  has  examined  the  relationship  between  existing 
monthly  maintenance  reports  and  aircraft  mishaps.  The 
reported  monthly  maintenance  and  personnel  variables  were 
analyzed  to  determine  if  any  combination  of  the  variables 
could  be  used  to  describe  a  predictive  statistical  model  that 
can  classify  a  squadron  as  a  mishap  or  non-mishap  squadron  in 
the  upcoming  month. 

Based  upon  a  graphical  analysis  there  were  no  obvious  one 
or  two  dimensional  relationships  that  could  be  used  to 
classify  a  mishap  squadron.  The  further  techniques  of 
principal  components  and  logistic  regression  did  not  produce 
any  higher  order  relationships  that  could  be  used  to  classify 
a  mishap  squadron. 

B.  CONCLUSIONS 

Based  on  this  particular  analyzed  data  there  apparently  is 
no  relationship  between  existing  monthly  maintenance  reports 
and  aircraft  mishaps.  This  result  might  indicate  that  with 
this  particular  data  there  is  no  existing  relationship,  or  it 
might  indicate  that  a  monthly  generated  report  may  not  be 
helpful  in  predicting  an  aircraft  mishap.  The  fact  that  the 
data is reported at the end of the month could possibly
conceal  any  subtle  useful  changes  or  indications  that  could  be 
exploited  to  forecast  aircraft  mishaps. 

C. RECOMMENDATIONS

This thesis indicates that there is no relationship
between existing monthly maintenance reports and aircraft
mishaps that could be used in developing a predictive
statistical model to classify a squadron as a mishap or
non-mishap squadron.

Two  alternative  recommendations  are  evident.  The  first 
alternative  is  to  accept  that  there  may  be  no  exploitable 
relationship  between  monthly  maintenance  reports  and  aircraft 
mishaps  and  focus  elsewhere  to  determine  a  predictive 
statistical  model  that  forecasts  aircraft  mishaps.  The  second 
alternative  recommendation  is  that  further  analysis  be  done, 
possibly  attempting  to  use  daily  maintenance  reports  versus 
monthly  maintenance  reports,  to  describe  a  predictive 
statistical  model  that  forecasts  aircraft  mishaps. 
[Appendix figures: density traces and coded scatter plots comparing
mishap squadrons and non-mishap squadrons. Recoverable axis labels:
Number of Sorties; Flight Hours; Assigned Number of Aircraft; Not
Mission Capable Maintenance-Scheduled; Not Mission Capable
Maintenance-Unscheduled; Not Mission Capable Maintenance-Supply;
Hours Partially Mission Capable Maintenance-Unscheduled; Hours Full
Mission Capable Maintenance-Unscheduled; Log Hours Not Mission
Capable Maintenance-Unscheduled; Log Not Mission Capable
Maintenance-Supply; Log Maintenance Man Hours per Flight Hour;
Flight Hours per Aircraft.]